28 Concentration Inequalities II
Sub-Gaussian random variables
A random variable X with mean \mu is called Sub-Gaussian with parameter \sigma (X \sim SubGau (\sigma)) if, for all s \in \mathbb{R},
\mathbb{E}_{X} \left[ e^{s (X - \mu)} \right] \leq \exp \left[ \frac{ s^{2} \sigma^{2} }{ 2 } \right].
The definition implies a Gaussian-like bound on the tail of X. Let X \sim SubGau (\sigma) and \mu = \mathbb{E}_{X} [X]. Then for any t > 0,
\mathbb{P}_{X} \left( X - \mu \geq t \right) \leq \exp \left[ \frac{ - t^{2} }{ 2 \sigma^{2} } \right].
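The tail bound follows from the definition by the standard Chernoff argument: for any s > 0,
\begin{aligned} \mathbb{P}_{X} \left( X - \mu \geq t \right) & = \mathbb{P}_{X} \left( e^{s (X - \mu)} \geq e^{s t} \right) \\ & \leq e^{- s t} \, \mathbb{E}_{X} \left[ e^{s (X - \mu)} \right] \\ & \leq \exp \left[ \frac{ s^{2} \sigma^{2} }{ 2 } - s t \right], \end{aligned}
where the first inequality is Markov's inequality. Minimizing the exponent over s (the minimizer is s = t / \sigma^{2}) gives the stated bound \exp \left[ - t^{2} / (2 \sigma^{2}) \right].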
Hoeffding’s inequality
Hoeffding’s inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount.
Hoeffding’s Lemma
Let X be a bounded random variable with a \leq X \leq b and \mu = \mathbb{E}_{X} [X]. Then for all s \in \mathbb{R},
\mathbb{E}_{X} \left[ e^{s (X - \mu)} \right] \leq \exp \left[ \frac{ s^{2} (b - a)^{2} }{ 8 } \right].
Equivalently, Hoeffding's lemma states that any bounded random variable X \in [a, b] is Sub-Gaussian with parameter \frac{b - a}{2},
X \sim SubGau \left( \frac{ b - a }{ 2 } \right).
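For example, a Rademacher variable X \in \{-1, +1\} with \mathbb{P} (X = 1) = \mathbb{P} (X = -1) = \frac{1}{2} has \mu = 0, a = -1, b = 1, so Hoeffding's lemma gives X \sim SubGau (1); indeed \mathbb{E}_{X} [e^{s X}] = \cosh (s) \leq e^{s^{2} / 2}.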
Hoeffding’s inequality
Let X_{1}, \dots, X_{n} be bounded independent random variables such that X_{i} \in [a_{i}, b_{i}], \forall i. Let X = \sum_{i=1}^{n} X_{i} and \mu = \mathbb{E}_{X} [X]. Then for t > 0,
\mathbb{P}_{X} \left( X - \mu \geq t \right) \leq \exp \left[ \frac{ - 2 t^{2} }{ \sum_{i=1}^{n} (b_{i} - a_{i})^{2} } \right].
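A sketch of the argument: applying the Chernoff argument to X and using independence together with Hoeffding's lemma for each X_{i} (with mean \mu_{i}),
\begin{aligned} \mathbb{P}_{X} \left( X - \mu \geq t \right) & \leq e^{- s t} \, \mathbb{E}_{X} \left[ e^{s (X - \mu)} \right] \\ & = e^{- s t} \prod_{i=1}^{n} \mathbb{E}_{X_{i}} \left[ e^{s (X_{i} - \mu_{i})} \right] \\ & \leq \exp \left[ \frac{ s^{2} }{ 8 } \sum_{i=1}^{n} (b_{i} - a_{i})^{2} - s t \right], \end{aligned}
and minimizing over s > 0 (at s = 4 t / \sum_{i=1}^{n} (b_{i} - a_{i})^{2}) yields the stated bound.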
Another version bounds the deviation of the sample mean.
Let X_{1}, \dots, X_{n} be bounded independent random variables such that X_{i} \in [a_{i}, b_{i}], \forall i. Let \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_{i} and \mu = \mathbb{E}_{\bar{X}} [\bar{X}]. Then for t > 0,
\mathbb{P}_{\bar{X}} \left( \bar{X} - \mu \geq t \right) \leq \exp \left[ -\frac{ 2 n^{2} t^{2} }{ \sum_{i=1}^{n} (b_{i} - a_{i})^{2} } \right].
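As a usage example, for X_{i} \in [0, 1] the bound becomes \mathbb{P}_{\bar{X}} \left( \bar{X} - \mu \geq t \right) \leq \exp \left[ - 2 n t^{2} \right], so requiring the right-hand side to be at most \delta shows that n \geq \frac{ \ln (1 / \delta) }{ 2 t^{2} } samples suffice for \bar{X} to exceed \mu by less than t with probability at least 1 - \delta.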
Martingales
Martingale sequence
Given a sequence of random variables X_{1}, \dots, X_{\infty}, define another sequence of random variables Y_{1}, \dots, Y_{\infty} where Y_{n} is a measurable function of X_{1}, \dots, X_{n},
Y_{n} = f (X_{1}, \dots, X_{n}).
If, for every n, \mathbb{E}_{Y_{n}} [Y_{n}] is finite and
\mathbb{E}_{Y_{n + 1} \mid X_{1}, \dots, X_{n}} [Y_{n + 1} \mid X_{1}, \dots, X_{n}] = Y_{n},
then Y_{1}, \dots, Y_{\infty} is a martingale sequence with respect to X_{1}, \dots, X_{\infty}.
Martingale difference sequence
Given a sequence of random variables X_{1}, \dots, X_{\infty}, define another sequence of random variables Z_{1}, \dots, Z_{\infty} where Z_{n} is a measurable function of X_{1}, \dots, X_{n},
Z_{n} = f (X_{1}, \dots, X_{n}).
If, for every n, \mathbb{E}_{Z_{n}} [Z_{n}] is finite and
\mathbb{E}_{Z_{n + 1} \mid X_{1}, \dots, X_{n}} [Z_{n + 1} \mid X_{1}, \dots, X_{n}] = 0,
then Z_{1}, \dots, Z_{\infty} is a martingale difference sequence with respect to X_{1}, \dots, X_{\infty}.
A common Martingale sequence
If X_{1}, \dots, X_{\infty} is a sequence of independent random variables with \mathbb{E}_{X_{i}} [X_{i}] = 0, \forall i, then Y_{1}, \dots, Y_{\infty} with Y_{n} = \sum_{i = 1}^{n} X_{i} is a martingale sequence.
\begin{aligned} \mathbb{E}_{Y_{n + 1} \mid X_{1}, \dots, X_{n}} [Y_{n + 1} \mid X_{1}, \dots, X_{n}] & = \mathbb{E} [Y_{n} + X_{n + 1} \mid X_{1}, \dots, X_{n}] \\ & = \mathbb{E} [Y_{n} \mid X_{1}, \dots, X_{n}] + \mathbb{E} [X_{n + 1} \mid X_{1}, \dots, X_{n}] \\ & = Y_{n} + \mathbb{E}_{X_{n + 1}} [X_{n + 1}] \\ & = Y_{n} \end{aligned}
Suppose Y_{1}, \dots, Y_{\infty} is a martingale sequence with respect to X_{1}, \dots, X_{\infty}. Then the sequence Z_{1}, \dots, Z_{\infty} with Z_{n} = Y_{n} - Y_{n - 1} is a martingale difference sequence.
\begin{aligned} \mathbb{E}_{Z_{n + 1} \mid X_{1}, \dots, X_{n}} [Z_{n + 1} \mid X_{1}, \dots, X_{n}] & = \mathbb{E} [Y_{n + 1} - Y_{n} \mid X_{1}, \dots, X_{n}] \\ & = \mathbb{E} [Y_{n + 1} \mid X_{1}, \dots, X_{n}] - \mathbb{E} [Y_{n} \mid X_{1}, \dots, X_{n}] \\ & = Y_{n} - Y_{n} \\ & = 0 \end{aligned}
Azuma Inequality
Let Z_{1}, \dots, Z_{\infty} be a martingale difference sequence of bounded random variables with Z_{i} \in [a_{i}, b_{i}], \forall i. Then for any t > 0,
\mathbb{P}_{Z_{1}, \dots, Z_{n}} \left[ \sum_{i = 1}^{n} Z_{i} \geq t \right] \leq \exp \left[ \frac{ - 2 t^{2} }{ \sum_{i = 1}^{n} (b_{i} - a_{i})^{2} } \right].
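As a usage example, applying the bound to the martingale Y_{n} = \sum_{i = 1}^{n} X_{i} built from independent Rademacher steps (so Z_{i} = X_{i} \in [-1, 1] and (b_{i} - a_{i})^{2} = 4 for every i) gives
\mathbb{P}_{Z_{1}, \dots, Z_{n}} \left[ Y_{n} \geq t \right] \leq \exp \left[ \frac{ - t^{2} }{ 2 n } \right],
matching what Hoeffding's inequality gives for this independent special case.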
McDiarmid’s Inequality
Given a function f: \mathbb{R}^{n} \to \mathbb{R} with the bounded difference property, that is, the maximum change of the function output induced by replacing any x_{i} with x'_{i} is bounded by c_{i},
\lvert f (x_{1}, \dots, x_{i}, \dots, x_{n}) - f (x_{1}, \dots, x'_{i}, \dots, x_{n}) \rvert \leq c_{i},
then for any t > 0 the function f of independent random variables X_{1}, \dots, X_{n} satisfies
\mathbb{P}_{X_{1}, \dots, X_{n}} \left[ f (X_{1}, \dots, X_{n}) - \mathbb{E}_{X_{1}, \dots, X_{n}} \left[ f (X_{1}, \dots, X_{n}) \right] \geq t \right] \leq \exp \left[ \frac{ - 2 t^{2} }{ \sum_{i=1}^{n} c_{i}^{2} } \right].
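As a check, the sample mean f (x_{1}, \dots, x_{n}) = \frac{1}{n} \sum_{i=1}^{n} x_{i} of variables X_{i} \in [a_{i}, b_{i}] has bounded differences with c_{i} = \frac{ b_{i} - a_{i} }{ n }, so McDiarmid's inequality gives
\mathbb{P}_{X_{1}, \dots, X_{n}} \left[ \bar{X} - \mu \geq t \right] \leq \exp \left[ \frac{ - 2 n^{2} t^{2} }{ \sum_{i=1}^{n} (b_{i} - a_{i})^{2} } \right],
recovering the sample-mean version of Hoeffding's inequality above.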